Intelligent system and methods of recommending media content items based on user preferences

ABSTRACT

A system and method for making program recommendations to users of a network-based video recording system utilizes expressed preferences as inputs to collaborative filtering and Bayesian predictive algorithms to rate television programs using a graphical rating system. The predictive algorithms are adaptive, improving in accuracy as more programs are rated.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to automated systems and methods for recommending items to users. More particularly, the invention relates to an adaptive network-based system and methods for predicting ratings for items of media content according to how likely they are to appeal to a user. The invention integrates multiple prediction algorithms and provides heuristics for selecting the most suitable algorithm for making a prediction for any single item, creating a suggestion and rating system having exceptional robustness and predictive accuracy.

2. Description of the Prior Art

The prior art provides various systems for filtering, suggesting and rating. Filtering, suggesting and rating, while they may employ similar methods, constitute separate challenges. In the presence of a large amount of content, be it merchandise, or videos, or newsgroup articles, filtration systems aim to limit the amount of content a user deals with, by presenting only that content that correlates most closely with the user's preferences. As such, their essential function is one of exclusion. Suggestion systems aim to direct a user's attention to items they may not have been aware of that are likely to appeal to them, based on their preferences. Thus, their essential function is one of inclusion. Rating systems assign ratings to content items, according to a user's expressed preferences. Hence, their essential function is one of ordering. Occasionally, systems are provided that are capable of performing more than one of the essential functions of excluding, including and ordering.

Some of the prior art examples are adaptive in nature; that is, they are capable of accommodating themselves to changing conditions, in a process that is analogous to learning. For example, over time, a user's preferences may change, sometimes gradually, and sometimes abruptly. Adaptive systems have the capacity to adapt to a user's changing preferences without any explicit input from the user. Often, adaptive systems must be taught, that is, their engines must be initialized with starting values. The teaching process usually consists of the user explicitly indicating their preferences. After being taught, adaptive systems make inferences by monitoring various implicit indicators of the user's preferences.

H. Strubbe, System and method for automatically correlating user preferences with a T.V. program information database, U.S. Pat. No. 5,223,924 (Jun. 29, 1993) and H. Strubbe, System and method for finding a movie of interest in a large movie database, U.S. Pat. No. 5,483,278 (Jan. 9, 1996) provide systems for rating movies and television programs by a user and correlating program information for unrated programs with the user's ratings, so that a program database customized to the user may be created. To rate the programs, the user accesses program information either by time slot or channel, and assigns a Boolean rating of “like” or “dislike.” A free-text search algorithm searches a text summary in the program information records rated by the user. The significant words of the text summary are tallied and weighted. A free-text search of unrated records is performed, and a retrieval value is computed. Records with retrieval values are judged to be programs likely to appeal to the user and are added to the database of preferred programs. While the described systems effectively allow the user to filter television programming and movies, it would be desirable to provide a scalar rating system, in which the user is able to express degrees of preference rather than a simple ‘yes’ or ‘no.’ Since free-text searches are computationally expensive, it would be advantageous to provide a more efficient, content-based algorithm. It would also be desirable to provide different types of predictive algorithms, thereby increasing prediction accuracy. In addition to assigning overall ratings to programs, it would be a great advantage to provide the user with the capability of rating individual program features, such as the actors or the director.

F. Herz, J. Eisner, L. Ungar, M. Marcus, System for generation of user profiles for a system for customized electronic identification of desirable objects, U.S. Pat. No. 5,754,939 (May 19, 1998) describes a client server-based system for retrieving items of interest to a user. An interest summary of the user is prepared, by querying the user about their interests. Each target item available over the network is described by a target profile. Target profiles are compared to each other and clustered according to similarity. Clusters and individual target items are compared with the user interest summary. Items likely to be of interest to the user are presented in a ranked listing. The user profile is stored on a proxy server, and security measures are provided to safeguard the user's identity. Relevance feedback is provided by monitoring which items a user expresses interest in. While efforts are made to preserve the user's confidentiality through various security measures, it would be desirable to provide a system in which the user's profile is stored locally, on the client side, and communication between the server and the client is stateless, so that the server is completely ignorant of the user's identity. It would also be desirable to provide a prediction engine on the client side, again rendering a stateful connection between client and server unnecessary. In addition to implicit relevance feedback, it would be an advantage to allow the user to correct their profile, thus allowing even greater predictive accuracy.

G. Graves, B. O'Conner, A. Barker, Apparatus and method of selecting video programs based on viewer's preferences, U.S. Pat. No. 5,410,344 (Apr. 25, 1995) describe a method for selecting television programs according to expressed viewer preferences that employs an adaptive prediction algorithm. Television programs are described in terms of attributes. A viewer explicitly rates different attribute-value pairs, also known as features. Based on these explicit viewer ratings, a neural network rates television programs. Programs with a high enough score are automatically recorded for viewing at a later time. The described method, however, must use explicit ratings; it does not employ or generate implicit ratings. Furthermore, the described method provides only a single prediction algorithm, limiting its versatility and robustness.

J. Hey, System and method for recommending items, U.S. Pat. No. 4,996,642 (Feb. 26, 1991) employs a conventional collaborative filtering algorithm to recommend movies to a customer from the inventory in a video store. The customer uses a scalar rating system to rate movies they have viewed. The resulting profile is paired with profiles of other customers who have rated at least a portion of those selections rated by the first customer, and an agreement scalar is computed for each of the pairings. Based on these pairings, a group of recommending customers is defined for the first customer. D. Payton, Virtual on-demand digital information delivery system and method, U.S. Pat. No. 5,790,935 (Aug. 4, 1998) describes a digital information system that delivers virtual on-demand information over digital transport systems. A collaborative filtering algorithm predicts content items that might be of interest to each subscriber. A. Chislenko, Y. Lashkari, D. Tiu, M. Metral, J. McNulty, Method and apparatus for efficiently recommending items using automated collaborative filtering and feature-guided automated collaborative filtering, U.S. Pat. No. 6,092,049 (Jul. 18, 2000) describe a method for recommending items to users using automated collaborative filtering. As with the other references described, a conventional collaborative filtering implementation, in which users are correlated to other users, is provided. B. Miller, J. Konstan, J. Riedl, System, method and article of manufacture for utilizing implicit ratings in collaborative filters, U.S. Pat. No. 6,108,493 (Aug. 22, 2000) describe a prediction information system utilizing collaborative filters. Unlike most collaborative filtering implementations, which operate on explicit ratings, the described system utilizes implicit measures. The accuracy of prediction attainable with collaborative filtering has been shown to be quite high. Nevertheless, conventional collaborative filtering implementations all require maintaining user information in a central place, such as on a server, leading to concerns about the user's privacy. Subsequently, similarities between pairs of users are computed on the server. It would be desirable to provide a collaborative filtering implementation based on similarity between pairs of items, rendering it unnecessary to maintain user information on a server, and eliminating the necessity of exchanging state information between client and server.

D. Whiteis, System and method for recommending items to a user, U.S. Pat. No. 5,749,081 (May 5, 1998) describes a system for recommending items of merchandise to a customer at the point of sale based on items already selected. Unlike the collaborative filtering implementations described above, the Whiteis system correlates items, rather than users, by tracking the number of times a pair of items occurs together in the same purchase. Based on the number of times a pair occurs, an adjusted weight is calculated that is taken to be an index of similarity between the two items of the pair. The described system is simple and easily implemented and is well suited for point-of-sale use. However, since similarity is calculated simply on whether a pair occurred in the same purchase, it can only be a very general approximation of similarity. For example, in a video store, a father may select “Lion King” for his children and “Body Heat” for himself and his wife. In the described system, that purchase would be listed as a correlating pair, albeit a weakly correlating pair if it did not occur frequently in the total population of pairs. Furthermore, it would be an advantage to filter the weights to eliminate pairs that correlate weakly. It would also be an advantage to provide information about pairs that anti-correlate.

A. Lang, D. Kosak, Information system and method for filtering a massive flow of information entities to meet user information classification needs, U.S. Pat. No. 5,867,799 (Feb. 2, 1999) provide an apparatus, method and computer program product for information filtering in a computer system receiving a data stream from a computer network. Several layers of adaptive filtering are provided, both content-based and collaborative, to ensure that a receiver receives only those content items that correlate very highly with their preferences. There are individual user filters and community filters. The essential function of the system according to Lang, et al. is overwhelmingly one of exclusion, with the multiplicity of filter layers. However, in a system whose aim is to predict and suggest items most likely to appeal to a user, the redundant filtering of the described system would limit the amount of content available to the user, thus limiting user choices rather than providing new and unexpected alternatives.

Thus, there exists a need in the art for a system for predicting a rating for an item according to how much it will appeal to a user. It would be advantageous to provide multiple prediction engines that are capable of providing the most accurate prediction for any particular item. It would be highly desirable to provide a convenient user interface for teaching the system the user's preferences. Furthermore, it would be an advantage for the system to have an adaptive capability, so that it can learn and adapt to shifts in user preferences. It would be desirable to provide a distributed collaborative filtering engine that guaranteed a user's privacy by eliminating the necessity of correlating the user to other users or groups of users. It would be a great advantage to calculate similarity between items, rather than between users, and to perform such calculation on the client side, eliminating the necessity of a stateful connection between the server and the client. It would be a significant technological advance to provide an adaptive modeling prediction engine that both accepted explicit user ratings and had the capability of inferring user ratings in the absence of explicit ratings. It would be a great convenience to display the output of the various prediction engines in a single, integrated list.

SUMMARY OF THE INVENTION

The invention provides a network-based intelligent system and method for predicting ratings for items of media content according to how likely they are to appeal to a user based on the user's own earlier ratings. Collaborative filtering and content-based prediction algorithms are integrated into a single, network-based system. System heuristics determine which of the provided algorithms provides the most reliable predictor for any single new content item.

In a preferred embodiment of the invention, a network-based video recording system rates television programs according to the likelihood that they will appeal to a user, based on the user's own previous ratings of television programming. Individual recording units, clients, are in intermittent communication with a server. A user interface is provided in which the user teaches the system by recording their programming preferences. Using an interactive rating system that employs a “thumbs up” and “thumbs down” metaphor for favorable and unfavorable ratings, respectively, individual users may give an overall rating to a program, or they may rate individual features of the program, for example, directors, actors, and genres, provided in interactive lists. The user's preferences are then used as inputs to one or more predictive algorithms.

A collaborative filtering algorithm is provided, in which individual items are correlated to each other, instead of the usual approach of correlating users to each other. Lists of rated items are periodically uploaded from individual clients to the server. The ratings are extracted from the lists and stored in matrices on the server, eliminating the necessity of keeping client state information on the server, thus advantageously providing an important privacy safeguard. The server computes correlation factors for pairs of programs and provides them to the client in a correlating items table. The client searches the table for pairs containing programs already rated by the user. Thus, other programs not rated by the user, which correlate to the rated program, can be assigned a similar rating.

In the absence of up-to-date correlating items data, an adaptive modeling algorithm is also provided that works by using content-based filtering: in particular it uses the features of a program and a user's prior preferences on those features to arrive at a prediction of how much the user would like a program. In the absence of explicit ratings of a program's features, a modified naïve Bayes algorithm infers ratings of the program features based on previous ratings by the user of programs containing at least one of the features. Based on the inferred feature ratings, a prediction is made of how much the user would like the program. Unlike conventional implementations of the naïve Bayes algorithm, the invention extends the Bayes algorithm to handle multi-valued features, i.e. a program generally has more than one actor, or it may be classified in more than one genre. Additionally, the invention provides for the integration of explicit advice, the expressed user preferences, with inferred ratings.

The user is also provided with the capability of correcting preferences, either explicit ones or inferred ones. Thus, the user may optimize the preference profile in order to obtain predictions that coincide with their expectations more closely.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 provides a block diagram of the functional architecture of a network-based system for predicting the likelihood that an item of media content will appeal to a user based on previous ratings of content items by the user, according to the invention;

FIG. 2 shows a screen from a user interface to the system of FIG. 1, wherein suggested items are displayed to a user, and access is gained to a user interface, wherein a user teaches the system the user's preferences, according to the invention;

FIG. 3 shows a top-level screen of the teaching interface of FIG. 2, according to the invention;

FIGS. 4 and 5 show screens from the teaching interface of FIG. 2 for teaching content category and sub-category preferences, according to the invention;

FIG. 6 shows a screen from the teaching interface of FIG. 2 for teaching actor preferences, according to the invention;

FIG. 7 shows a screen from the teaching interface of FIG. 2 for correcting user ratings and predicted ratings of actors, according to the invention;

FIG. 8 shows a screen from the teaching interface of FIG. 2 for teaching program preferences, according to the invention;

FIG. 9 shows a screen from the teaching interface of FIG. 2 for correcting user ratings and predicted ratings of programs, according to the invention; and

FIG. 10 shows the block diagram of a distributed collaborative filtering prediction system, within the system of FIG. 1, according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring now to FIG. 1, shown is a block diagram of an intelligent, distributed system for recommending items of media content to a user, based on the user's expressed preferences. Although FIG. 1 illustrates a single client, such illustration is understood to be exemplary only. One skilled in the art will readily appreciate that the system includes a plurality of clients. A client 10, over a conventional network connection 12, is in intermittent communication with a server 11. A user interface 14 is provided, wherein the user teaches the system the user's preferences concerning programs, categories of programs and program features. When the system has built a sufficient knowledge base, a series of prediction engines having an adaptive capability predict ratings for unrated program items, based on the user's expressed preferences. The preferred method employs a novel, client-side collaborative filtering engine 17. Lists of items rated by the user 15 are transmitted to the server 11, where they are aggregated, with the rated items information from many other users, into a single list. A listing of correlating items is generated 19 and transmitted back to the client 10, where the collaborative filtering engine predicts ratings based on the correlation provided by the server, and the user's previous ratings.

It may happen that up-to-date correlation information is unavailable. In such event, an adaptive, content-based prediction engine 18 predicts ratings of unrated program items. Preferably, the content-based engine employs explicit user ratings of various program features as inputs. However, in the absence of explicit ratings, a naïve Bayes classifier infers ratings from which a rating is predicted.

The invention is created and implemented using conventional programming methods well-known to those skilled in the arts of computer programming and software engineering.

Method of Teaching by Users

The invented system allows users to rate an item, from −3 to 3 (7 levels), wherein negative ratings are unfavorable and positive ratings are favorable. Ratings are expressed using a graphical metaphor, in which “thumbs up” indicate favorable ratings and “thumbs down” indicate unfavorable ratings. 0, indicated by an absence of thumbs, is a neutral rating. Typically, the user assigns thumbs to a program or a program feature by depressing a button on a remote control that is provided with the client unit.

Referring to FIG. 2, a suggestion screen 20 of the invented system displays a listing of suggested programs 21 accompanied by their rating icons 22. Close to the top of the screen, beneath the banner, a selection bar is positioned over a menu item 23 that grants access to the teaching screens. The suggestions 21 are displayed in a descending sort according to the number of thumbs 22. An arrow-shaped cursor 24 allows the user to scroll through the entire list of suggestions.

Referring now to FIG. 3, the first screen 30 of a user interface for assigning ratings to television programs and individual program features is shown. Selections for rating program category 31 or genre, individual programs 32, actors 33 and directors 34 are provided. The user manipulates the various interface elements by means of the provided remote control. As FIGS. 4 and 5 show, selecting the ‘teach category’ option 31 navigates the user to a ‘teach category’ screen 40 and subsequently to a ‘teach sub-categories’ screen 50. Selecting any one of the displayed categories or sub-categories allows the user to assign ‘thumbs’ ratings to the selected categories, although the display is not immediately redrawn to reflect the user's ratings. Upon selecting the ‘teach actors’ option 33, the user is presented with several further options, two of which are shown in FIGS. 6 and 7.

As previously indicated, the user's preferences, expressed as ratings, are necessary as input to the various predictive algorithms of the invention. Before the algorithms are able to start providing the user with predictive ratings, they must be “taught” or initialized with a minimum amount of user preference data. In the course of viewing or selecting a program for recording, viewers may assign a rating to a program. By assigning ratings in this way, however, it can take a fairly long time before the preference database has accumulated enough data to teach the predictive algorithms. In order to accelerate the process of accumulating preference data, the various teaching screens with their lists are provided, so that the user may initially go through the lists and systematically rate a threshold number of programs and individual features. To facilitate the process of working through the feature list, lists of varying lengths are provided. For example, a ‘Teach famous actors’ screen 60 provides a compact list 61 of high-profile actors. Thus, the process of rating actors is greatly facilitated for the user. Additionally, a ‘Teach all actors’ screen (not shown) provides a comprehensive list of actors from which the user may also work. FIG. 6 also shows the manner of assigning ratings. The user is provided with a menu of possible ratings 62 from which they select a rating for the corresponding actor. After selecting the rating, the selected rating is highlighted. As previously described, the user interface is not immediately repainted to reflect the user's rating selection.

In FIG. 7, a ‘Correct rated actors’ screen 70 is shown. While the example shown indicates that no actors have yet been rated, if actors had been rated, the user would be presented with a list of rated actors, similar to lists already shown. As will be described in greater detail further below, provision is made for distinguishing between user-assigned ratings and predicted ratings assigned by the system. Separate, similar, but distinct icons are provided for user ratings 71 and predicted ratings 72. In the event that actors had been rated, the user would be provided with a single, aggregate list of all rated actors, both user-rated, and those for whom the system has predicted ratings. Thus, the user may correct their own ratings, and they may revise predicted ratings. Advantageously, the ability to view and correct predicted ratings allows the user to browse the basis features that were used by the system to make a particular prediction for a show. This will allow the user to directly modify his or her preference profile in order to obtain predictions more in line with what he or she expected. As FIGS. 8 and 9 show, the process of rating programs and correcting rated programs is almost identical to that for actors.

While screens are not shown for all attributes, the process of rating attributes and correcting them is virtually identical across the entire selection of attributes. The preferred embodiment of the invention provides ‘actors,’ ‘genre’ and ‘directors’ as program attributes. However, the list of attributes need not be so limited. The manner of providing the user interface employs conventional techniques of computer programming and graphics display commonly known to those skilled in the arts of computer programming, software engineering, and user interface design.

As described above, as a preference profile is built, the user may explicitly rate programs and individual program features, and he or she may correct ratings, either their own, or predicted ratings. Additionally, other system heuristics may apply a rating to an item. For example, when a user selects a program to be recorded, the system automatically assigns one thumb up, corresponding to a rating of one, to that item if the user had not already rated the program. Other heuristics are based on whether a program was watched after it was recorded, and for how long.
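The recording heuristic described above can be illustrated with a minimal sketch. The function name `rate_on_record` and the dictionary-based preference store are hypothetical and chosen only for illustration; the specification does not say how ratings are stored.

```python
def rate_on_record(ratings, program):
    """Apply the recording heuristic: one thumb up unless already rated.

    `ratings` maps program titles to thumbs ratings from -3 to 3
    (a hypothetical in-memory stand-in for the preference database).
    """
    if program not in ratings:
        ratings[program] = 1  # implicit rating: one thumb up
    return ratings[program]


ratings = {"Frasier": 2}
rate_on_record(ratings, "Friends")   # unrated program: gets one thumb up
rate_on_record(ratings, "Frasier")   # existing explicit rating is kept
```

An explicit rating always takes precedence here, which matches the condition that the heuristic applies only if the user had not already rated the program.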

Predicting Program Ratings

As previously described, the user teaches the system his or her preferences by assigning overall ratings to programs they are familiar with, and rating individual program elements, such as actors and genres. Subsequently, the preferences are fed to one or more predictive algorithms to assign ratings to programs that predict the likelihood of the user liking them. The preferred embodiment of the invention includes a collaborative filtering algorithm and a content-based adaptive modeling algorithm.

The total number of programs available to the user may be considered to be a pool, or a population of items. As previously described, the user assigns ratings to a subset of that pool using discrete ratings. In the preferred embodiment of the invention, the rating is measured as a number of thumbs from −3 to 3, with 0 connoting a neutral rating. There also exists a pool or population of program elements, or features, a portion of which have been rated by the user according to the same rating system.

Collaborative Filtering

The purpose of collaborative filtering is to use preferences expressed by other users/viewers in order to make better predictions for the kinds of programs a viewer may like. In order to rate how much the current user will like a program to be rated, “Friends,” for example, the collaborative algorithm evaluates the other programs that the user has rated, for example, two thumbs up for “Frasier,” and uses the correlating items table downloaded from the server to make a prediction for “Friends.” The correlating items table may indicate that “Friends” and “Frasier” are sixty-six percent correlated and “Friends” and “Seinfeld” are thirty-three percent correlated. Assuming that the user has expressed 2 thumbs up for “Frasier” and 1 thumb up for “Seinfeld,” the algorithm will predict 1.6 thumbs up for “Friends,” closer to 2 thumbs up than 1 thumb up. This prediction will be rounded to 2 thumbs up in the user interface, and thus the prediction is that the user will like “Friends” to the extent of 2 thumbs up.
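The text does not state the formula used to combine the correlated ratings. One plausible reading, offered here only as an assumption, is a correlation-weighted average of the user's ratings. The sketch below uses the hypothetical names `predict_rating`, `correlations` and `user_ratings`; under this assumption the “Friends” example works out to roughly 1.67 thumbs, close to the 1.6 quoted above, and likewise rounds to 2 thumbs up.

```python
def predict_rating(target, user_ratings, correlations):
    """Correlation-weighted average of the user's ratings (an assumed formula).

    `correlations` maps (rated_item, predicted_item) to a correlation factor.
    Returns None when no rated item correlates with the target.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (rated_item, predicted_item), corr in correlations.items():
        if predicted_item != target or rated_item not in user_ratings:
            continue
        weighted_sum += corr * user_ratings[rated_item]
        weight_total += abs(corr)
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


correlations = {("Frasier", "Friends"): 0.66, ("Seinfeld", "Friends"): 0.33}
user_ratings = {"Frasier": 2, "Seinfeld": 1}
prediction = predict_rating("Friends", user_ratings, correlations)
thumbs = round(prediction)  # value shown in the user interface
```

Dividing by the sum of absolute correlations keeps the prediction on the same −3 to 3 scale as the input ratings, even when some pairs anti-correlate.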

The invented implementation of collaborative filtering provides the following advantages:

-   1. No person-to-person correlation.
    -   The server 11, which collects “anonymized” preference profiles from the individual clients, does not, as is conventionally done, compute a correlation between pairs of users. Instead, it computes a correlation between pairs of programs. Thus, no sensitive or personal user information is ever kept or needed on the server. Preference information is HTTP posted from the client to the server; once the network connection is terminated, the server has an anonymous set of preferences, and it does not matter whose preferences they are. In order to guarantee the user's anonymity, the entire preference database of each client is periodically uploaded to the server. Thus there is no need to issue cookies or maintain any client state information on the server.
-   2. Distributed, or local, computation of program ratings.
    -   Pairs of programs are evaluated that:
        -   are sufficiently highly correlated to be good predictors of each other, and
        -   have been rated by enough viewers.
    -   The server then transmits to each client the correlations for each significant program pair in a correlating items table. Next, the client filters these pairs to find only those pairs for which one of the programs in the pair has been rated by the user. Thus, continuing with the example above, the server may have calculated a high correlation between “Spin City” and “Friends,” based on the preference information from thousands of other users. However, since the user at hand has rated neither “Spin City” nor “Friends,” that correlation is not useful, and the client will therefore filter out that pair. On the other hand, since the user has rated “Frasier” and “Seinfeld,” the pairs [Frasier->Friends 0.66] and [Seinfeld->Friends 0.33] are retained and used as inputs to the collaborative filtering algorithm.
-   3. The architecture of the collaborative filtering server is highly parallelizable and consists of stages that pre-filter shows and pairs of shows so that computing correlation, a computationally expensive process, for all pairs may be avoided.
-   4. Carouselling of correlations from server to clients.
    -   The tables of correlating items are broadcast daily, but since correlations between pairs do not change drastically from day to day, each client only processes correlations periodically, eliminating the necessity of recomputing correlations on a daily basis.
-   5. Robustness of collaborative filtering engine.
    -   Each stage of the collaborative filtering engine may be implemented by several computers. Thus, if one computer is non-functional for a short period of time, correlation computations allocated to that computer will queue up at its feeder computer, and the process is delayed somewhat without a noticeable disruption of service to the user. Additionally, since correlations do not change greatly from month to month, losing a portion of the correlations only results in a graceful degradation of prediction quality, and not a catastrophic loss in quality.
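The client-side filtering of program pairs described in item 2 might look like the following sketch. The tuple layout `(rated_item, other_item, correlation)` and the name `filter_pairs` are assumptions made for illustration; the actual format of the correlating items table is not given.

```python
def filter_pairs(correlating_items, user_ratings):
    """Keep only pairs whose first program the user has rated.

    Each table entry is assumed to be a directed tuple
    (rated_item, other_item, correlation). Pairs whose second program is
    already rated are skipped, since no prediction is needed for them.
    """
    kept = []
    for rated_item, other_item, corr in correlating_items:
        if rated_item in user_ratings and other_item not in user_ratings:
            kept.append((rated_item, other_item, corr))
    return kept


table = [
    ("Frasier", "Friends", 0.66),
    ("Seinfeld", "Friends", 0.33),
    ("Spin City", "Friends", 0.80),  # user has rated neither program
]
user_ratings = {"Frasier": 2, "Seinfeld": 1}
retained = filter_pairs(table, user_ratings)
```

With the example ratings, the “Spin City” pair is dropped and the two pairs involving “Frasier” and “Seinfeld” are retained, as in the text.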

FIG. 10 provides a diagram of the functional architecture of a distributed collaborative filtering engine, according to the invention. A client 10 is in intermittent contact with a server 11 over a network connection 12. As indicated above, the server-side architecture may be implemented across several computers, or it may be implemented on a single computer having multiple functional units. In the preferred embodiment of the invention, the network connection is either a dial-up connection over publicly available telecommunications networks or a satellite connection. However, other types of network connections known to those skilled in the art are within the spirit and scope of the invention. According to the preferred embodiment of the invention, the client 10 and the server 11 are in communication for a brief period on a daily basis so that the client may transmit rated items 15 to the server 11 and receive the correlating items table 104 b broadcast from the server 11 periodically. A listing of unrated items 16, wherein the unrated items consist of television programs, is resident on the client 10. The list of unrated programs is presented to the user on a conventional display means 25. The display means may be a television screen, a CRT monitor, an LCD display, or any other generally known display means. During the period of daily contact with the server, the list of unrated items 16 is updated. An interactive user interface 14 allows the user to rate items known to the user using the graphical rating system previously described. As items are rated by the user, the rated items are saved to a listing of rated items 15. In the preferred embodiment of the invention, the listing of rated items 15 exists as a conventional table. However, other commonly known data structures, such as delimited text files, would be equally suitable.

When the client 10 and the server 11 are in communication, the list of rated items 15 is transmitted to the server. The correlating items table 104 a is generated using the rated items 15 input of all users, or a random sample of them. A frequency filter 101 blocks items and pairs of items that have not been rated by a sufficient number of users, thus minimizing the storage required for the pair matrices 102. The filter thresholds also serve to assure a minimum quality of the correlation calculations, since they get more accurate with more input due to their adaptive nature. Filtering takes place in two stages. The first stage tracks the frequency of any unique item. If the frequency is too low, the item and user rating are not considered. The second stage monitors frequencies of unique pairs. If a pair's frequency is too low, the pair is no longer considered.
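The two-stage frequency filtering described above can be illustrated with a minimal sketch. The function name `frequency_filter`, the threshold parameters, and the representation of each client's rated items as a dictionary are assumptions made for illustration only; the specification does not prescribe them.

```python
from collections import Counter
from itertools import combinations


def frequency_filter(rated_lists, min_item_count, min_pair_count):
    """Hypothetical two-stage filter over per-client rated-item lists.

    rated_lists: iterable of dicts mapping item -> rating, one per client.
    Returns (kept_items, kept_pairs).
    """
    rated_lists = list(rated_lists)

    # Stage 1: track how many clients rated each unique item.
    item_counts = Counter(item for ratings in rated_lists for item in ratings)
    kept_items = {item for item, n in item_counts.items() if n >= min_item_count}

    # Stage 2: track how often each unique pair of surviving items
    # was rated by the same client.
    pair_counts = Counter()
    for ratings in rated_lists:
        survivors = sorted(item for item in ratings if item in kept_items)
        pair_counts.update(combinations(survivors, 2))
    kept_pairs = {pair for pair, n in pair_counts.items() if n >= min_pair_count}

    return kept_items, kept_pairs
```

Sorting the surviving items before pairing gives every pair a single canonical key, so only one pair matrix per unique pair would need to be stored.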

Pair matrices store the user ratings in an n by n matrix, where n is the number of distinct levels on the rating scale used by users. As previously described, in the current embodiment of the invention, there are seven discrete ratings: −3, −2, −1, 0, 1, 2 and 3. Thus, each pair matrix is a 7×7 matrix. The foregoing description is not intended to be limiting. Other rating scales, resulting in matrices of other dimensions, are also suitable. Provided below is an algorithm for storing ratings to the pair matrices 102:

-   1. Get a pair of ratings for a pair of items.
-   2. Find the matrix for that pair, and create it if it does not exist yet.
-   3. Using the rating pair as an index into the matrix, locate the cell for that rating pair.
-   4. Increment the cell's value by one.
-   5. Go to 1.
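The storage steps above can be sketched as follows. The names `pair_matrices` and `store_rating_pair`, the offset of 3 used to map ratings −3..3 onto indices 0..6, and the alphabetical ordering of the pair key are illustrative assumptions, not details given in the specification.

```python
from collections import defaultdict

RATINGS = (-3, -2, -1, 0, 1, 2, 3)
N = len(RATINGS)  # seven rating levels give a 7x7 matrix


def new_matrix():
    """Return an empty N x N matrix of counts."""
    return [[0] * N for _ in range(N)]


# Step 2: a matrix is created automatically the first time a pair is seen.
pair_matrices = defaultdict(new_matrix)


def store_rating_pair(item_a, rating_a, item_b, rating_b):
    """Tally one client's ratings for a pair of items (steps 1 to 4)."""
    # Assumed convention: order the pair so (A, B) and (B, A) share a matrix.
    if item_b < item_a:
        item_a, rating_a, item_b, rating_b = item_b, rating_b, item_a, rating_a
    matrix = pair_matrices[(item_a, item_b)]
    # Step 3: the rating pair indexes the cell; shift -3..3 to 0..6.
    row, col = rating_a + 3, rating_b + 3
    # Step 4: increment the cell's value by one.
    matrix[row][col] += 1
    return matrix
```

Calling `store_rating_pair` once per rating pair corresponds to the loop in step 5. Because the matrices hold only counts, nothing in them identifies the client that contributed a rating.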

Using commonly known analytical methods, a correlation factor is calculated from the pair matrix for each program pair. Every matrix yields a correlation factor from approximately −1 to 1.
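The specification does not name the analytical method. As one plausible choice, the sketch below computes a Pearson correlation coefficient directly from the counts in a pair matrix; the function name `correlation_from_matrix` and the decision to return 0.0 for empty or constant data are assumptions.

```python
import math


def correlation_from_matrix(matrix, levels=(-3, -2, -1, 0, 1, 2, 3)):
    """Pearson correlation of the two items' ratings, weighted by cell counts."""
    total = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0
    for i, x in enumerate(levels):
        for j, y in enumerate(levels):
            count = matrix[i][j]
            total += count
            sum_x += count * x
            sum_y += count * y
            sum_xx += count * x * x
            sum_yy += count * y * y
            sum_xy += count * x * y
    if total == 0:
        return 0.0  # no ratings recorded for this pair
    mean_x, mean_y = sum_x / total, sum_y / total
    var_x = sum_xx / total - mean_x ** 2
    var_y = sum_yy / total - mean_y ** 2
    if var_x <= 0 or var_y <= 0:
        return 0.0  # one item always received the same rating
    covariance = sum_xy / total - mean_x * mean_y
    return covariance / math.sqrt(var_x * var_y)
```

Counts concentrated on the main diagonal yield a factor near 1, and counts concentrated on the anti-diagonal yield a factor near −1.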

The correlation filter 103 prevents pairs of items that have correlation factors close to 0 from being considered. Thus, pairs that only correlate weakly are not used. Those pairs that pass the correlation filter 103 are assembled into a correlating items table 104 a, which lists all other significantly correlating items for every item. That list, or a part thereof, is distributed back to the client 10.
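A minimal sketch of the correlation filter and table assembly follows. The function name, the `min_abs_correlation` threshold, and the nested-dictionary layout of the table are assumptions for illustration; the specification says only that factors close to 0 are excluded.

```python
def build_correlating_items_table(pair_correlations, min_abs_correlation):
    """Drop weakly correlating pairs and index the remainder by item.

    pair_correlations: dict mapping (item_a, item_b) -> correlation factor.
    Returns a dict mapping each item to {other_item: factor}.
    """
    table = {}
    for (item_a, item_b), factor in pair_correlations.items():
        # Strong negative factors (anti-correlations) are kept as well;
        # only factors close to 0 are filtered out.
        if abs(factor) < min_abs_correlation:
            continue
        table.setdefault(item_a, {})[item_b] = factor
        table.setdefault(item_b, {})[item_a] = factor
    return table
```

Indexing the table by item lets a client look up every significantly correlating item for an unrated program in a single step.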

On the client side, a predictive engine assigns a rating to a new unrated item that is predictive of how much that item will appeal to the user, based on the rated items 15 and the correlating items table, which describes the correlations between items. Provided below is an algorithm for rating an unrated item.

-   1. Get an unrated item (program).
-   2. Search for the item in the correlating items table. If not found, no prediction is made. Go to 1.
-   3. Create a work list of correlation factors for all correlating items the user has rated, together with the user ratings.
-   4. If the work list is empty, no prediction is made. Go to 1.
-   5. Make sure the work list contains a fixed number of items.
    -   If the work list exceeds the fixed number, remove those that relate to the worst correlating item (program).
    -   If the list is too short, pad it with correlation factors of 1 and a neutral rating.
    -   The fixed length selected is a matter of design choice; the intent is to provide a fixed-length list of input so that results can be compared fairly when predicting for different unrated items, when the amount of data available for prediction varies.
-   6. The sum of (rating*correlation) of all items in the work list, divided by the sum of the absolute values of correlation factors for all items, constitutes the prediction: a rating for the item that is predictive of the degree to which the item will appeal to the user.

Enhancements to Ensure Privacy
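Before turning to privacy, the six client-side rating steps listed above can be sketched in code. The function name `predict_rating`, the default work list length of 4, and the reading of "worst correlating" as "smallest absolute correlation factor" are assumptions made for this illustration.

```python
def predict_rating(item, user_ratings, correlating_items, work_list_length=4):
    """Predict a rating for an unrated item, or return None if not possible.

    user_ratings: dict mapping item -> the user's rating (-3..3).
    correlating_items: dict mapping item -> {other_item: correlation factor}.
    """
    # Step 2: look the item up in the correlating items table.
    correlated = correlating_items.get(item)
    if not correlated:
        return None

    # Step 3: work list of (factor, user rating) for items the user has rated.
    work_list = [
        (factor, user_ratings[other])
        for other, factor in correlated.items()
        if other in user_ratings
    ]
    # Step 4: nothing to base a prediction on.
    if not work_list:
        return None

    # Step 5: trim the weakest entries, or pad with factor 1 and neutral rating.
    work_list.sort(key=lambda entry: abs(entry[0]), reverse=True)
    work_list = work_list[:work_list_length]
    work_list += [(1.0, 0)] * (work_list_length - len(work_list))

    # Step 6: sum(rating * correlation) / sum(|correlation|).
    numerator = sum(rating * factor for factor, rating in work_list)
    denominator = sum(abs(factor) for factor, _ in work_list)
    if denominator == 0:
        return None
    return numerator / denominator
```

Padding with neutral ratings pulls a prediction toward 0 when little evidence is available, which is what makes results comparable across items with different amounts of data.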

As described above, the system is designed to safeguard the user's anonymity. There are many clients, and over time more and more programs are rated. When a particular client has rated a few more programs, the server would need to include that input in the pair matrices to further increase the accuracy and scope of the correlations. Conventionally, only the new data would be transmitted to the server to permit it to do the work of updating the pair matrices. However, to do that, the server would need to save the state for each client and identify the client in order to know all the items rated by that client. Privacy requirements disallow any method of identifying the client when accepting input. Therefore, the rated items list is sent in its entirety, on a periodic basis. Clients use the same time window for sending in their lists, at randomly chosen times. The server accepts input as described earlier, keeping counts in the matrices. In addition, the server compensates for the repetitive input by normalizing the counts. Normalization involves dividing all counts by a factor that keeps the counts constant, as if all clients kept an unchanging list of rated items. In this way, as the clients slowly grow and alter their lists, the counts on the server will slowly adapt to the new state, and correlation factors will stay up to date, even as the general opinion among users changes. Such a gradual shift in opinion among television viewers occurs inevitably as television series gradually change, one program becoming more like some programs and becoming less like others. In effect, it allows the tracking of correlation over time, without tracking the actual changes to a particular client's ratings.

Adaptive Filtering Algorithm

As indicated previously, items may be rated by one of two algorithms, either the collaborative filtering algorithm described above, or an adaptive filtering algorithm incorporating a naïve Bayes classifier, described in greater detail herein below. It has been empirically determined that collaborative filtering is a more reliable predictor of items that are likely to appeal to the user than the adaptive filtering algorithm. Accordingly, it is preferable that the collaborative filter rates an item. However, in the absence of collaborative filtering data, a heuristic invokes the adaptive filtering algorithm. Collaborative filtering data may not be present if the client has been unable to contact the server. The adaptive modeling algorithm works by using content-based filtering. In particular, it uses a program's features and the user's previously expressed preferences on individual program features to arrive at a prediction of how much the user would like the program. As previously noted, programs may be described in terms of attributes: actor, genre and director, for example. Generally, each attribute may have several values. For instance, ‘genre’ may have any of several hundred values; ‘actor’ may have any of several thousand values. Each individual value, or attribute-value pair, may be seen as a distinct program feature. Thus, genre: ‘situation comedy’ is a feature; actor: ‘Jennifer Anniston’ is another feature. During the teaching phase, the user may have rated ‘Jennifer Anniston’ 1 thumb up and ‘situation comedies’ 2 thumbs up. Based on the user's expressed preferences, and program information from the TRIBUNE MEDIA SERVICES (TMS) database, which indicates that the genre feature for “Friends” is ‘situation comedy’ and the actor feature is ‘Jennifer Anniston,’ the system would assign 1 thumb up to “Friends.”

In the example above, it is worth noting that, in the presence of an actor having a 1 thumb up rating and a genre having a 2 thumbs up rating, the system assigned the program 1 thumb up, rather than 2, indicating that the actor rating was weighted more heavily than the genre rating in predicting the program rating. A feature's specificity is an important determinant of the weight given to it in computing a program rating. It is apparent that, in considering a population of features for a pool of programs, a specific actor occurs less frequently in the population than a genre. Thus, in the example, the actor ‘Jennifer Anniston’ would occur less frequently than genre ‘situation comedy.’ Thus, if a feature is rare in a population of features, and occurs in a description of a program, because of its rarity in the general population, it is probable that it is highly relevant to the program description. Accordingly, the less likely that a feature occurs in a general population of features, the more heavily weighted the feature will be in predicting a rating for the program having the feature. Thus, a specific actor may occur across several different genres, and genre will be weighted less heavily than actor for prediction of program ratings.
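The specification does not state the weighting formula. The sketch below shows one way specificity weighting could work, assuming each feature is weighted by the negative logarithm of its relative frequency, so that rarer features count for more. The function name and the frequencies in the usage note are hypothetical.

```python
import math


def weighted_program_rating(feature_ratings, feature_frequency):
    """Combine explicit feature ratings, weighting rare features more heavily.

    feature_ratings: dict mapping feature -> explicit user rating.
    feature_frequency: dict mapping feature -> fraction of programs
        containing that feature (0 < f <= 1). These values are assumed inputs.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for feature, rating in feature_ratings.items():
        # Assumed weighting: the rarer the feature, the larger the weight.
        weight = -math.log(feature_frequency[feature])
        weighted_sum += weight * rating
        total_weight += weight
    if total_weight == 0:
        return None  # every feature occurs in all programs; no usable weight
    return round(weighted_sum / total_weight)
```

With an actor rated 1 and occurring in 0.1% of programs, and a genre rated 2 and occurring in 10% of programs, the weighted average is about 1.25, which rounds to 1 thumb up, in line with the “Friends” example.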

Significantly, the foregoing discussion has been directed to explicit feature ratings given by the user. It is preferable that new programs be rated according to explicitly stated user preferences. When the adaptive modeling algorithm initializes, the program features are evaluated. If the user has explicitly rated even one of the program features, the explicit user ratings are utilized to compute a rating for the program. In the absence, however, of explicit feature ratings, the adaptive modeling algorithm employs a naïve Bayes classifier to infer feature ratings, and compute a program rating based on the inferred ratings. It is fundamental to the invention that user ratings always take precedence over inferred ratings. Thus, if the user has rated even one feature of a program, the naïve Bayes classifier is not invoked and the one rated feature is employed to compute the rating for that program. Inference happens as follows: if the user assigns an overall rating to a program, for example, “Cheers,” the system evaluates the separate features of “Cheers” and assigns ratings to the features based on the user's overall rating of the program. The inferred feature ratings are then used to compute a rating for a new, unrated program. The process of generating inferred ratings is described in greater detail below.

The system keeps a tally of how often a feature of an item occurs in a population of rated items, and the rating given to the item by the user. For example, a user may rate a program starring Jennifer Anniston two thumbs up. Another program with another actor may be rated one thumb up. The system keeps a tally of the number of times a feature occurs in the population of rated programs, and what rating the program received, and makes an intelligent conclusion about the significance of a feature to the user, based on the feature occurrences and the ratings. Thus, if in the population of rated programs, Jennifer Anniston occurred ten times, each time in a program receiving two thumbs up, then the probability would be high that any other program she occurred in would receive two thumbs up.

A stepwise description of a naïve Bayes classifier, according to the invention, follows:

As indicated above, a pool of items (programs) exists. The user of the system assigns ratings to a subset of that pool using a discrete number of levels. In the preferred embodiment, the rating is expressed as a number of thumbs up or down, corresponding to ratings of −3 to 3; to avoid ambiguity, the rating ‘0’ is omitted from calculations. The items are described in terms of predefined attributes. For example: the main actor, the genre, the year the program was released, the duration, the language, the time it is aired, the TV channel it is broadcast on, how much of it has been viewed, and so on. Each attribute may have a plurality of values. For each value of each attribute, a vector C_x is defined (C₀ is the first attribute, C₁ the second, C₂ the third . . . etc.). The length of each vector is set as the number of discrete rating levels; in the preferred embodiment, six. The vectors are used to keep track of the frequency of that feature (attribute-value pair) and the rating for each occurrence. A special global vector is kept to track the overall number of items that receive a particular rating, regardless of the features in that item. Where P is a prediction vector, C₁ . . . C_n are the feature vectors for features 1 through n, G is the global vector and there are m rating levels, P is calculated according to:

$$
\begin{aligned}
P(0) &= \frac{C_{1}(0) * C_{2}(0) * C_{3}(0) \cdots C_{n}(0)}{G(0)} \\
P(1) &= \frac{C_{1}(1) * C_{2}(1) * C_{3}(1) \cdots C_{n}(1)}{G(1)} \\
P(2) &= \frac{C_{1}(2) * C_{2}(2) * C_{3}(2) \cdots C_{n}(2)}{G(2)} \\
&\;\;\vdots \\
P(m-1) &= \frac{C_{1}(m-1) * C_{2}(m-1) * C_{3}(m-1) \cdots C_{n}(m-1)}{G(m-1)}
\end{aligned}
\tag{1}
$$

The rating where P shows a maximum value is the most probable rating for the feature. The distribution of the values in P is an indicator of certainty.
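The tallying of feature vectors and the evaluation of equation (1) can be illustrated with a short sketch. The class name `FeatureTally`, the string encoding of features, and the choice to skip features that have never been tallied are assumptions; the specification gives only the vectors and the formula.

```python
LEVELS = (-3, -2, -1, 1, 2, 3)  # the rating '0' is omitted, leaving six levels


class FeatureTally:
    """Keeps one count vector per feature plus the global vector G."""

    def __init__(self):
        self.feature_counts = {}                # feature -> six counts (a C vector)
        self.global_counts = [0] * len(LEVELS)  # the global vector G

    def record(self, features, rating):
        """Tally a user-rated program against each of its features."""
        if rating == 0:
            return  # neutral ratings are left out of the calculation
        index = LEVELS.index(rating)
        self.global_counts[index] += 1
        for feature in features:
            counts = self.feature_counts.setdefault(feature, [0] * len(LEVELS))
            counts[index] += 1

    def prediction_vector(self, features):
        """Evaluate P(i) = C_1(i) * ... * C_n(i) / G(i) for every level i."""
        known = [self.feature_counts[f] for f in features if f in self.feature_counts]
        vector = []
        for i in range(len(LEVELS)):
            if not known or self.global_counts[i] == 0:
                vector.append(0.0)
                continue
            product = 1
            for counts in known:
                product *= counts[i]
            vector.append(product / self.global_counts[i])
        return vector

    def most_probable_rating(self, features):
        """Return the rating level at which P is largest, or None if P is all zero."""
        vector = self.prediction_vector(features)
        best = max(range(len(LEVELS)), key=vector.__getitem__)
        return LEVELS[best] if vector[best] > 0 else None
```

For example, after three programs sharing an actor and a genre are each rated two thumbs up, the prediction vector for a new program with the same features peaks at the level for rating 2.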

In classical implementations of the naïve Bayes classifier, it is assumed that for every attribute, only one value may occur at a time, like the color for a car. However, for a program, multiple values may occur simultaneously. For example: for the attributes actors, directors, writers and genres, there are multiple simultaneous values, and sometimes none at all. For the purpose of predicting program ratings, the values lists for the various attributes are collapsed into aggregate genre and cast vectors for those features that occur in the program to be rated. Thus, for the purpose of predicting program ratings, two attributes are created, cast and genre.

The two attributes are different in nature, and the method of collapsing, or combining all values for an attribute, is different. For the genre attribute, vectors are summed, the population of genres consisting of only a few hundred separate values. Due to the large population of actors, directors and writers, possibly numbering in the tens of thousands, they are combined by taking the maximum frequency at each rating level. Additionally, actors are often clustered.
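The two collapsing rules can be sketched as follows. The function names are illustrative, and returning an empty list when a program has no values for an attribute is an assumed convention.

```python
def collapse_genres(vectors):
    """Aggregate genre vector: element-wise sum across all genre features."""
    return [sum(level) for level in zip(*vectors)]


def collapse_cast(vectors):
    """Aggregate cast vector: element-wise maximum across actors, directors, writers."""
    return [max(level) for level in zip(*vectors)]
```

Given two six-element vectors `[1, 0, 0, 0, 2, 0]` and `[0, 0, 1, 0, 3, 1]`, summing gives `[1, 0, 1, 0, 5, 1]` while taking the maximum gives `[1, 0, 1, 0, 3, 1]`.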

With the number of attributes being reduced to two, the possibility exists that either attribute can have 0 occurrences of matching preferences. Thus the input to the formula may be 0, 1 or 2 attributes. If there is only data for one attribute, the global vector is ignored, and only the product term is considered.

As mentioned above, a measure of confidence may be derived by considering the distribution of values in P. However, in the early stages of using the system, the number of items that have been rated is small, which produces extremes in the distribution of values, and would lead to unreliable confidence ratings. Instead, the confidence rating may be based on the amount of evidence for the winning rating level. If the maximum rating level is x, then all counts for every category (x) for all genres and all actors are summed. This value is called P(x)e, the total amount of evidence for P(x). A global vector, H, is kept to track the highest count for each winning rating level that has been found during system operation. P(x)e is normalized against H(x) so that the confidence is only high when it is a record or near record for that bin. This system still favors 1 in the early use of the system. To give it a more sensible start, without affecting long-term behavior, a Laplace method is used that avoids anomalies where there is very little data. Instead of confidence=P(x)e/H(x), confidence=(P(x)e+1)/(H(x)+2) is used. This makes it start at 0.5 initially, and the restriction on confidence relaxes as evidence grows.
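The smoothed confidence measure can be illustrated with a small sketch. The class name `ConfidenceTracker` and the choice to update the record vector H before computing the ratio are assumptions; the specification gives only the formula (P(x)e + 1) / (H(x) + 2).

```python
class ConfidenceTracker:
    """Tracks H, the highest evidence count seen so far for each rating level."""

    def __init__(self, levels=6):
        self.record = [0] * levels  # the global vector H

    def score(self, level_index, evidence):
        """Return the Laplace-smoothed confidence for the winning rating level.

        evidence: P(x)e, the summed counts for the winning level x.
        """
        # Assumed ordering: update the record first so the ratio never exceeds 1.
        self.record[level_index] = max(self.record[level_index], evidence)
        return (evidence + 1) / (self.record[level_index] + 2)
```

With no evidence at all the score is (0 + 1) / (0 + 2) = 0.5, matching the stated starting value. A later prediction backed by 10 counts, a new record, scores 11/12, and a subsequent one backed by only 4 counts scores 5/12.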

To summarize the foregoing: the collaborative filtering algorithm is the preferred method of predicting program ratings. In the absence of up-to-date correlation factors provided by the server, a content-based adaptive filtering algorithm predicts program ratings. The features of a program to be rated are evaluated. If the user has explicitly rated any of the program's features, the user ratings are employed to predict a rating for the program. In predicting a rating, features of high specificity are more heavily weighted than those of low specificity. In the event that the user has not rated any of the program's features, a modified naïve Bayes classifier calculates the probability of the program being assigned a particular rating, using inferred feature ratings derived from previous user ratings of programs. A probability vector allows a confidence level to be expressed for the predicted rating.

Display of Rated Items

As FIG. 2 shows, rated items are displayed as a list of suggestions. Both user-rated items and items carrying predicted ratings are listed in the same display. As previously described, user ratings are differentiated from predicted ratings through the use of distinct ratings icons. The output from both prediction engines, the collaborative filtering engine and the content-based engine, is scaled and integrated into the same list. As previously described, the ratings range from three thumbs up to three thumbs down, with three thumbs up being most favorable, and three thumbs down being most unfavorable, corresponding to discrete numerical ratings in a range of −3 to 3. For display, the items are sorted in descending order from most favorable to least favorable. Within each discrete rating, the items are sorted in descending order according to confidence of the prediction, the confidence level ranging from 1 to −1. Confidence level values are not necessarily discrete values, as the ratings are. It will be apparent that explicit user ratings have the highest confidence level, thus they are listed first within the ratings groups.
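The display ordering described above can be sketched as a single sort. The dictionary keys `rating`, `confidence` and `user_rated` are hypothetical field names chosen for this illustration.

```python
def sort_for_display(items):
    """Order items by rating (descending), then explicit user ratings first,
    then by confidence (descending) within each rating group.

    items: list of dicts with keys 'title', 'rating', 'confidence', 'user_rated'.
    """
    return sorted(
        items,
        key=lambda item: (-item["rating"], not item["user_rated"], -item["confidence"]),
    )
```

Placing `user_rated` in the sort key makes the precedence of explicit ratings independent of how their confidence value happens to be stored.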

Although the invention has been described herein with reference to certain preferred embodiments, one skilled in the art will readily appreciate that other applications may be substituted for those set forth herein without departing from the spirit and scope of the present invention. Accordingly, the invention should only be limited by the Claims included below.

1. A distributed system for predicting items for a user, the system comprising: a client; a server in communication with a plurality of clients, including the client, over a network connection; device logic at the server that periodically receives a list of user-rated items from each client of the plurality of clients, the lists of user-rated items aggregated into a single aggregated list of items, the items being associated with media content, the user-rated items in the list of user-rated items from the client being rated by a user, the user-rated items in the lists of user-rated items from other clients of the plurality of clients being rated by other users; logic at the server that filters user-rated items from the single aggregated list of items based on frequency by monitoring frequency of the user-rated items and discarding items and the items' corresponding user-ratings that do not satisfy a threshold frequency; logic at the server that creates matrices, each matrix of the matrices corresponding to each unique pair of items from the single aggregated list of items, each matrix storing user-ratings for each item of the unique pair of items, the matrices anonymous with respect to the user and the other users; logic at the server that computes a rating correlation between items of the unique pair of items from each matrix; logic at the server that filters out non-significant rating correlations of unique pairs of items; logic at the server that compiles a list of correlating items comprising items for which the rating correlations were not filtered out; logic at the server that periodically sends the list of correlating items to the client; and logic at the client that predicts a rating for an unrated item based on the correlations provided in the list of correlating items.
2. The system of claim 1, wherein the connection between any of the plurality of clients and the server is stateless.
3. The system of claim 1, wherein the list of user-rated items received by the server from the client includes each item rated by the user and the corresponding rating assigned by the user.
4. The system of claim 3, wherein the corresponding rating assigned by the user is assigned according to a scalar system of discrete ratings.
5. The system of claim 1, wherein dimensions of each matrix of the matrices are n by n, where n equals a number of discrete user-ratings.
6. The system of claim 1, wherein the user-ratings are selected from a group −3, −2, −1, 0, 1, 2, and 3, and a negative value indicates an unfavorable rating, a positive value indicates a favorable rating, and a ‘0’ indicates a neutral rating.
7. The system of claim 5, wherein n=7.
8. The system of claim 1, wherein pairs of user-ratings are saved to the matrices by: retrieving a pair of user-ratings for a pair of items; determining whether a matrix storing user-ratings for the pair of items exists; creating the matrix for the pair of items if the matrix does not exist; locating a cell that stores the pair of user-ratings in the matrix by using the pair of user-ratings as an index to the matrix; and incrementing the cell's value that corresponds to the pair of user-ratings by 1.
9. The system of claim 1, wherein a correlation factor is calculated for each matrix.
10. The system of claim 9, where the correlation factor is of the range of −1 to 1.
11. The system of claim 9, wherein a positive correlation factor indicates a positive correlation and a negative correlation factor indicates an anti-correlation.
12. The system of claim 1, wherein the logic at the server that filters non-significant correlations comprises a correlation filter that filters item pairs that correlate weakly.
13. The system of claim 12, wherein a correlation factor having a non-zero value that approaches zero indicates a weak correlation between items of a pair.
14. The system of claim 1, wherein the logic at the client that predicts a rating for an unrated item comprises a prediction engine, the prediction engine predicting a rating for an unrated item by: receiving an unrated item; searching for the unrated item in the list of correlating items; creating a work list of correlation factors for all items in the list of correlating items correlating to the unrated item, that have been rated by the user; insuring that the work list is of a predetermined, fixed length by removing most weakly correlating items when the work list exceeds the predetermined fixed length and padding the work list with correlation factors of 1 and neutral ratings when the work list is shorter than the predetermined fixed length; multiplying a value for the user-rating and the correlation factor for each item in the work list and summing products of the value for the user-rating and the correlation factor; and dividing a sum of the products of the value for the rating and the correlation factor by a sum of absolute values of correlation factors for each item in the work list, a resulting value constituting a predicted rating for the unrated item.
15. The system of claim 1, wherein the server comprises a single computer having multiple functional units.
16. The system of claim 1, wherein the server comprises a single server having a plurality of functional units and separate stages of the system are implemented in separate functional units.
17. The system of claim 1, wherein the server comprises a plurality of computers and separate stages of the system are implemented on separate computers, the computers being in communication with other separate computers of the server.
18. The system of claim 1, wherein the media content comprises any of network television programming, cable television programming, films, pay-per-view television programming and video.
19. The system of claim 1, the system being a component of a network-based video recording system.
20. A method of compiling a list of correlated items, the method comprising the steps of: periodically receiving at a server a list of items rated by the user from a client of a plurality of clients over a stateless network connection, the items being associated with media content; aggregating the list of items at the server with lists from other clients of the plurality of clients in contact with the server into a single aggregated list of items, the items in the lists from the other clients rated by other users of a plurality of users; filtering the single aggregated list of items by monitoring frequency of user-rated items, and discarding items and corresponding user-ratings that do not satisfy a threshold frequency; tallying user-ratings for each item of each unique pair of items from the single aggregated list of items and storing tallies in one or more pair matrices, each pair matrix corresponding to each unique pair of items, the one or more pair matrices anonymous with respect to users of the plurality of users; computing a rating correlation between items of the unique pair of items from each pair matrix; filtering out non-significant rating correlations of unique pairs of items; compiling a list of correlating items comprising items for which the rating correlations were not filtered out; and periodically sending by the server the list of correlating items to the client of the plurality of clients.
21. The method of claim 20, wherein the list of user-rated items received at the server comprises the list of user-rated items from the client of the plurality of clients, and the list of user-rated items from the client includes each item rated by the user and the corresponding rating assigned by the user.
22. The method of claim 20, wherein the user rates the items according to a scalar system of discrete ratings.
23. The method of claim 20, wherein dimensions of the one or more pair matrices are n by n, where n equals a number of discrete user-ratings.
24. The method of claim 23, wherein n=7.
25. The method of claim 20, wherein the step of tallying user-ratings for each item and storing tallies in one or more pair matrices further comprises the steps of: retrieving a pair of user-ratings for a pair of items; determining whether the matrix storing ratings of the pair of items exists; creating the matrix storing the pair of items if the matrix does not exist; locating a cell that stores the pair of user-ratings in the matrix by using the pair of user-ratings as an index to the matrix; and incrementing the cell's value that corresponds to the pair of user-ratings by 1.
26. The method of claim 20, wherein a correlation factor is calculated for each matrix.
27. The method of claim 26, wherein the correlation factor is of the range of −1 to 1.
28. The method of claim 26, wherein a positive correlation factor indicates a positive correlation and a negative correlation factor constitutes an anti-correlation.
29. The method of claim 20, wherein the step of filtering out non-significant correlations comprises the step of: filtering out correlations with a correlation factor having a non-zero value that approaches zero.
30. The method of claim 20 further comprising a client predicting a rating for an unrated item comprising: searching for the unrated item in the list of correlating items; creating a work list of correlation factors from all items in the list of correlating items correlating to the unrated item, that have been rated by the user; insuring that the work list is of a predetermined, fixed length; multiplying a value for the user-rating and the correlation factor for each item in the work list, and summing products of the value for the user-rating and the correlation factor; and dividing the sum of the products of the value for the user-rating and the correlation factor by a sum of absolute values of correlation factors for each item in the work list, a resulting value constituting a predicted rating for the unrated item.
31. The method of claim 30, wherein the step of insuring that the work list is of a predetermined, fixed length comprises one of the steps of: removing most weakly correlating items when the work list exceeds the predetermined fixed length; and padding the work list with correlation factors of one and neutral ratings when the work list is shorter than the predetermined, fixed length.
32. The method of claim 20, wherein the server comprises a single server having a plurality of functional units and separate stages of the method are executed in separate functional units.
33. The method of claim 20, wherein the server comprises a plurality of computers and separate stages of the method are executed on separate computers, the computers being in communication with other separate computers of the server.
34. The method of claim 20, wherein the media content comprises any of network television programming, cable television programming, films, pay-per-view television programming, and video.
35. The method of claim 20, wherein the method is implemented within a network-based video recording system.